
Add batch span processor benchmarks #3017

Merged (1 commit) on Mar 17, 2021

Conversation

@sbandadd (Contributor) commented Mar 11, 2021

Description:
This PR adds two benchmarks.

  1. The current benchmark executes forceFlush() on every loop iteration, which creates a bottleneck and prevents the batch span processor from being stressed. It also measures only throughput, which is not helpful on its own, since the number of spans actually exported matters as well. BatchSpanProcessorMultiThreadBenchmark is created to address this.
  2. Measuring the CPU usage of the exporter thread is also important, but the current benchmark consumes as much CPU as possible, which makes that measurement meaningless. To maintain a steady state, this PR adds a benchmark that generates 10k spans per second per thread (see the sketch after this description). One would need to attach a profiler such as YourKit or JProfiler to the benchmark run to understand the processor's CPU usage. BatchSpanProcessorCpuBenchmark is created for this purpose.

This PR also fixes a bug in calculating the number of dropped/exported spans: previously, the dropped and exported counters were each counting the other.
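For illustration only (this is not the PR's actual code): a fixed-rate benchmark loop can be built by spacing span creations evenly in time, so a profiler attached to the process sees a steady state rather than a saturated CPU. The sketch below uses hypothetical names (FixedRateSpanGenerator, runSpan); runSpan stands in for starting and ending a span through the processor under test.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch: emit work at a fixed rate (~10k ops/sec per thread)
// so exporter-thread CPU usage can be profiled in a steady state.
public final class FixedRateSpanGenerator {
  private static final long SPANS_PER_SECOND = 10_000;
  private static final long PERIOD_NANOS = TimeUnit.SECONDS.toNanos(1) / SPANS_PER_SECOND;

  static void generate(Runnable runSpan, long totalSpans) {
    long nextDeadline = System.nanoTime();
    for (long i = 0; i < totalSpans; i++) {
      runSpan.run(); // stand-in for span start/end through the processor
      nextDeadline += PERIOD_NANOS;
      long sleepNanos = nextDeadline - System.nanoTime();
      if (sleepNanos > 0) {
        LockSupport.parkNanos(sleepNanos); // throttle to the target rate
      }
    }
  }

  public static void main(String[] args) {
    // Demo: ~10k no-op "spans" spread over roughly one second.
    generate(() -> {}, SPANS_PER_SECOND);
  }
}
```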


The review thread below is attached to this excerpt from the diff (truncated in the page capture):

```java
private long getMetric(boolean dropped) {
  String labelValue = String.valueOf(dropped);
  Optional<Long> value =
```
A reviewer (Contributor) commented on the excerpt:

I have a feeling two loops would be very similar code while being more readable.

@sbandadd (Contributor, Author) replied on Mar 12, 2021:

The existing code uses two loops, and it took me a good amount of time to debug why the dropped/exported span metrics were invalid; I got a bit confused about what the two loops were doing. A functional style makes it easy to see what is being filtered and how the data is being mapped.

That said, I would leave this to the maintainers.
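As a minimal sketch of the functional style being proposed, with hypothetical types (Point, ProcessedSpansMetric) standing in for the SDK's metric data: the filter makes visible which label is selected, and the map shows what is extracted.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for one labeled point of the processor's
// dropped/exported span counter.
record Point(String droppedLabel, long value) {}

final class ProcessedSpansMetric {
  private final List<Point> points;

  ProcessedSpansMetric(List<Point> points) {
    this.points = points;
  }

  // One stream pipeline instead of two loops: the filter shows which
  // points are selected (dropped=true vs dropped=false) and the map
  // shows what is extracted from them.
  long getMetric(boolean dropped) {
    String labelValue = String.valueOf(dropped);
    Optional<Long> value =
        points.stream()
            .filter(point -> labelValue.equals(point.droppedLabel()))
            .map(Point::value)
            .findFirst();
    return value.orElse(0L);
  }

  public static void main(String[] args) {
    ProcessedSpansMetric metric =
        new ProcessedSpansMetric(List.of(new Point("true", 7), new Point("false", 42)));
    System.out.println(metric.getMetric(true));  // 7 (dropped)
    System.out.println(metric.getMetric(false)); // 42 (exported)
  }
}
```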

@sbandadd force-pushed the sbandadd-add-benchmarks branch 2 times, most recently from d33590f to be485ea on March 12, 2021 21:34
@sbandadd force-pushed the sbandadd-add-benchmarks branch from be485ea to 408915f on March 15, 2021 05:50
@codecov (bot) commented Mar 15, 2021

Codecov Report

Merging #3017 (408915f) into main (0d687bd) will increase coverage by 90.72%.
The diff coverage is n/a.


@@             Coverage Diff             @@
##             main    #3017       +/-   ##
===========================================
+ Coverage        0   90.72%   +90.72%     
- Complexity      0     2813     +2813     
===========================================
  Files           0      324      +324     
  Lines           0     8774     +8774     
  Branches        0      883      +883     
===========================================
+ Hits            0     7960     +7960     
- Misses          0      551      +551     
- Partials        0      263      +263     
Impacted Files Coverage Δ Complexity Δ
...xtension/incubator/trace/data/SpanDataBuilder.java 63.63% <0.00%> (ø) 7.00% <0.00%> (?%)
.../java/io/opentelemetry/sdk/resources/Resource.java 94.44% <0.00%> (ø) 12.00% <0.00%> (?%)
...lemetry/sdk/trace/samplers/ParentBasedSampler.java 100.00% <0.00%> (ø) 21.00% <0.00%> (?%)
...opentelemetry/opencensusshim/OpenTelemetryCtx.java 90.00% <0.00%> (ø) 4.00% <0.00%> (?%)
...lemetry/sdk/common/InstrumentationLibraryInfo.java 100.00% <0.00%> (ø) 4.00% <0.00%> (?%)
...lemetry/sdk/autoconfigure/EnvironmentResource.java 100.00% <0.00%> (ø) 2.00% <0.00%> (?%)
.../sdk/metrics/SynchronousInstrumentAccumulator.java 84.21% <0.00%> (ø) 8.00% <0.00%> (?%)
...y/sdk/autoconfigure/SpanExporterConfiguration.java 72.54% <0.00%> (ø) 13.00% <0.00%> (?%)
...ry/opencensusshim/OpenTelemetryContextManager.java 88.88% <0.00%> (ø) 9.00% <0.00%> (?%)
...ntelemetry/sdk/trace/SdkTracerProviderBuilder.java 89.65% <0.00%> (ø) 10.00% <0.00%> (?%)
... and 314 more

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d687bd...408915f.

@anuraaga (Contributor) left a comment:

Having fixed-throughput overhead benchmarks is very helpful, thanks!

@jkwatson (Contributor) commented:

I finally got to spend some dedicated time on this today and got JMH hooked up with the async-profiler profiling option, and I do see some significant overhead introduced by the usage of ArrayBlockingQueue (although it's been tough to get really consistent results).

Thanks for putting in the work on this. Now, to agree on the best solution to the issue. :)
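To make the concern concrete, here is a minimal, hypothetical demonstration (not the processor's code) of why a shared ArrayBlockingQueue can add overhead under multi-threaded span creation: every offer() acquires the queue's single internal lock, so producer threads serialize on it.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical demo: two producers contend on the single lock inside
// ArrayBlockingQueue, the same pattern as many threads enqueueing spans.
public final class QueueContentionDemo {
  public static void main(String[] args) throws InterruptedException {
    ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<>(2048);
    Runnable producer = () -> {
      for (int i = 0; i < 1_000_000; i++) {
        queue.offer(new Object()); // lock acquired per call; drops when full
      }
    };
    Thread t1 = new Thread(producer);
    Thread t2 = new Thread(producer);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println("queued (bounded at capacity): " + queue.size());
  }
}
```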

@anuraaga (Contributor) commented:

@jkwatson approved these benchmarks offline, so I will go ahead and merge. Benchmarks are living code, so I'm sure we'll continue to tweak them, but this is an improvement. Thanks @sbandadd!

@anuraaga anuraaga merged commit 23ce8fe into open-telemetry:main Mar 17, 2021